In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

knitr::opts_chunk$set(echo = FALSE)
library(tidyverse)
library(knitr)
hook_output <- knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
  lines <- options$output.lines
  if (is.null(lines)) {
    return(hook_output(x, options))  # pass to default hook
  }
  x <- unlist(strsplit(x, "\n"))
  more <- "..."
  if (length(lines)==1) {        # first n lines
    if (length(x) > lines) {
      # truncate the output, but add ....
      x <- c(head(x, lines), more)
    }
  } else {
    x <- c(more, x[lines], more)
  }
  # paste these lines together
  x <- paste(c(x, ""), collapse = "\n")
  hook_output(x, options)
})

Workshop

You will be using the scRNAseq dataset to answer all the following questions

Add your answers (text and/or code chunks as required) in the space after each question

The scRNAseq dataset can be found in the scRNAseq package. To load it, run the following

devtools::install_github("devangthakkar/scRNAseq")
library(scRNAseq)

Please answer the following questions

What type of object is cell_1?

class(scRNAseq$cell_1)

Filter the dataset to only include genes that are specifically expressed in the brain or in the heart and assign it to a new dataset organ_data. How many genes does organ_data have?

organ_data <- scRNAseq %>%
              filter(organ == "heart" | organ == "brain")
organ_data

Using the organ_data dataset, select the following columns - gene_name, organ, and any column with the word cell in the column name. Store the result into a new dataset filtered_data. How many columns does filtered_data have?

filtered_data <- organ_data %>%
                 select(gene_name | organ | contains("cell"))
filtered_data

Sort the filtered_data dataset from Z to A based on gene_name. Store the result into a new dataset sorted_data.

sorted_data <- filtered_data %>% 
               arrange(desc(gene_name))
sorted_data

Using the sorted_data dataset, we now want to look at what organ does cell_1 originate from. Group the data set by the organ, and summarize the mean counts for cell_1

sorted_data %>% group_by(organ) %>%
                summarize(mean(cell_1))

BONUS: Identify the origin to which all the 10 cells belong. You will need to use the across functionality to select all columns that contain the word cell. Check out the documentation here: https://dplyr.tidyverse.org/reference/across.html

sorted_data %>% group_by(organ) %>% 
                summarise(across(contains("cell"), mean))

hirscheylab/tidybiology documentation built on May 20, 2022, 10:55 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

hirscheylab/tidybiology
Utility Functions and Data Sets for Tidy Biology

In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

Workshop

Please answer the following questions

R Package Documentation

Browse R Packages

We want your feedback!

hirscheylab/tidybiology Utility Functions and Data Sets for Tidy Biology

In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

Workshop

Please answer the following questions

R Package Documentation

Browse R Packages

We want your feedback!

hirscheylab/tidybiology
Utility Functions and Data Sets for Tidy Biology